FIRE 2019

Forum for Information Retrieval Evaluation

Indian Statistical Institute, Kolkata

12th - 15th December

The Impact of an Era of Information Ubiquity on the Design and Evaluation of Information Retrieval Systems

Nicholas Belkin, Rutgers University, USA

In the emerging technological and socio-technical environment, people will be (and already are) constantly and ubiquitously immersed in a sea of information. The field of information retrieval (IR), and especially of interactive IR (IIR), needs to construe them as such, and not merely, or even only, as “users” who will stop what they are doing in order to engage with an IR system. In this presentation, I identify some characteristics of this rapidly developing environment that are especially salient for how we should conceive of supporting people in their interactions with information in this milieu, and propose a framework specifying the factors that will need to be considered in designing methods and circumstances for effective support. The resulting understanding of support for interaction with information in the era of information ubiquity has strong implications not just for IR systems per se, but also for how the effectiveness and usefulness of such support should and could be evaluated. Major ethical issues also arise in this context; these implications and issues are raised and discussed, with the aim of encouraging the IR community to attend to them before it is too late to have an effect.


Thinking about Evaluation in Today's Information Access World

Donna Harman, National Institute of Standards and Technology, USA

If we could look back 50 years, we would see an extremely limited information access environment. Information could be found only through librarians, who supplemented their knowledge with manually produced indexes (for some fields), or possibly via one's colleagues. Cyril Cleverdon of Cranfield, England conducted research on indexing methodologies in the early 1960s, resulting in the proof that manual indexing was NOT needed; he also developed an evaluation methodology that was quickly adopted by the information retrieval community. Nevertheless, information access remained restricted throughout the 70s and 80s due to a lack of computing power and, equally important, a lack of information in electronic form.
By the early 1990s computers were finally powerful enough and there were reasonable amounts of text in electronic form. TREC in 1992 showed that the ranking algorithms developed in the 60s and 70s actually scaled up to handle large amounts of full-text documents. The Internet (1990) and the new search engines such as Excite, AltaVista, and Google came along, finally giving end users decent search capability. Web sites exploded with content, blogging and twittering started, and the information access world was suddenly vast, including multimedia, e-commerce sites, online encyclopedias, huge archives of scanned material, and so on.
This talk will address how the evaluation technologies developed back in the 60s have adapted, where possible, to today's world. Central to all good evaluation in information retrieval is a clear understanding of the underlying goal of the task to be evaluated. Ideally one could conduct huge user studies or study interleaved results as the big commercial search engines do; however, this is not usually possible, and so the fallback is test collections. The design of these test collections, and the selection of appropriate metrics, are critical to a successful evaluation. Numerous case studies from the various public evaluations will be analyzed with respect to their effectiveness and also their inherent biases.


Miscommunication in Social Media: Bots, Hate and Fake

Paolo Rosso, Universitat Politècnica de València, Spain

The relative anonymity of social media facilitates the propagation of toxic, hateful and exclusionary messages. Social media therefore contribute to the polarization of society, as we have recently witnessed in events such as the last presidential election in the US and the Brexit and Catalan referendums. Moreover, social media foster information bubbles and echo chambers, and every user may end up receiving only the information that matches her personal biases and beliefs. A perverse effect is that social media are a breeding ground for the propagation of fake news: when a piece of news matches our beliefs or outrages us, we tend to share it without checking its veracity. In this talk I will address the two problems above and the key role that bots play in them. Finally, I will describe shared tasks that have recently been organised to identify bots and to detect hate speech and fake news.


Evaluating Personalised Search and Recommendation: two sides of the same coin?

Gabriella Pasi, Università degli Studi di Milano-Bicocca, Italy

Since the publication in 1992 of the article by Nick Belkin and Bruce Croft titled “Information Filtering and Information Retrieval: two sides of the same coin?”, the task of Information Filtering (IF) has evolved into a rich and coherent research area, giving rise to one of today’s most widespread technologies, i.e. Recommender Systems. On the Information Retrieval (IR) side, the development of methods for Personalized Search has exploited the key role of users and user-system interactions in the search process, thus bringing IR and IF closer to some extent. The aim of this talk is to present an analysis of both the similarities and the differences in evaluating (Personalized) Information Retrieval Systems and Recommender Systems. An overview will be presented of the methodologies and measures defined to evaluate the effectiveness of these two categories of systems, with the aim of possibly disclosing new perspectives.